This Quarto document serves as a practical illustration of the concepts covered in the Productive R Workflow online course.
1 Introduction
This document offers a straightforward analysis of the well-known penguin dataset. It is designed to complement the Productive R Workflow online course.
You can read more about the penguin dataset here.
Let’s load libraries before we start!
1.1 Loading data
The dataset has already been loaded and cleaned in the previous step of this pipeline.
Let’s load the clean version, together with a few functions available in functions.R.
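The contents of functions.R are not shown in this document. As a rough illustration only, a helper such as the `create_scatterplot()` function called later on might look like the sketch below. The function name and argument order follow those later calls; the body, the argument names `selected_species` and `selected_island`, and the use of the raw palmerpenguins data in place of the cleaned dataset are all assumptions.

```r
# Hypothetical sketch of a helper that functions.R might define.
# The body is an assumption, not the course's actual implementation.
library(dplyr)
library(ggplot2)

create_scatterplot <- function(data, selected_species, selected_island) {
  # Keep only rows for the requested species and island with both bill measurements
  filtered <- data |>
    filter(
      species == selected_species,
      island == selected_island,
      !is.na(bill_length_mm),
      !is.na(bill_depth_mm)
    )

  # Bill depth against bill length for that subset
  ggplot(filtered, aes(x = bill_length_mm, y = bill_depth_mm)) +
    geom_point(color = "#69b3a2") +
    labs(
      x = "Bill Length (mm)",
      y = "Bill Depth (mm)",
      title = paste(selected_species, "penguins on", selected_island)
    )
}

# Usage, with the raw palmerpenguins data standing in for the cleaned dataset
p <- create_scatterplot(palmerpenguins::penguins, "Adelie", "Torgersen")
```

Keeping the filtering and plotting inside one function means the same plot can be produced for any species and island pair without copying the ggplot2 code.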
1.2 Bill Length and Bill Depth
Now, let’s run a descriptive analysis, including summary statistics and graphs.
What’s striking is the slightly negative relationship between bill length and bill depth:
\[ \mathrm{Avg} = \frac{1}{n}\sum_{i=1}^{n} a_i = \frac{a_1 + a_2 + \cdots + a_n}{n} \]
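As a quick check of the averaging formula above, the sketch below computes the average by hand and compares it with R's built-in `mean()`. The vector `a` holds illustrative bill lengths chosen for this example, not results from the analysis.

```r
# Illustrative bill lengths in mm (hypothetical values for this example)
a <- c(39.1, 39.5, 40.3)
n <- length(a)

# Average written out as in the formula: the sum of the a_i divided by n
manual_avg <- sum(a) / n

# The built-in mean() computes the same quantity
builtin_avg <- mean(a)

# TRUE when both values agree up to floating-point tolerance
all.equal(manual_avg, builtin_avg)
```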
```r
library(tidyverse)   # provides filter() and ggplot()
library(hrbrthemes)  # provides theme_ipsum()

# Scatterplot of bill depth against bill length, excluding penguins of unknown sex
palmerpenguins::penguins |>
  filter(!is.na(sex)) |>
  ggplot(
    aes(x = bill_length_mm, y = bill_depth_mm)
  ) +
  geom_point(color = "#69b3a2") +
  labs(
    x = "Bill Length (mm)",
    y = "Bill Depth (mm)",
    title = "Surprising relationship?"
  ) +
  theme_ipsum()
```

It is also worth noting that bill length and bill depth differ considerably from one species to another. This is summarized in the two tables below:
# A tibble: 3 × 2
species average_bill_length
<chr> <dbl>
1 Adelie 38.8
2 Chinstrap 48.8
3 Gentoo 47.5
# A tibble: 3 × 2
species average_bill_depth
<chr> <dbl>
1 Adelie 18.3
2 Chinstrap 18.4
3 Gentoo 15.0
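The code behind the two tables above is folded in the rendered page and is not reproduced here. A minimal sketch of how such per-species averages could be computed with dplyr follows. It assumes the raw palmerpenguins data as a stand-in for the cleaned dataset, so the `species` column is a factor here rather than the character column shown in the output above.

```r
library(dplyr)

# Assumption: the raw palmerpenguins data stands in for the cleaned dataset
penguin_data <- palmerpenguins::penguins

# Average bill length per species, ignoring missing measurements
avg_length <- penguin_data |>
  group_by(species) |>
  summarise(average_bill_length = mean(bill_length_mm, na.rm = TRUE))

# Average bill depth per species, ignoring missing measurements
avg_depth <- penguin_data |>
  group_by(species) |>
  summarise(average_bill_depth = mean(bill_depth_mm, na.rm = TRUE))

avg_length
avg_depth
```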
Now, let’s check the relationship between bill depth and bill length for specific species and island combinations, starting with Adelie penguins on Torgersen island:
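```r
# Use the function in functions.R; patchwork provides the `+` and `/` layout operators
library(patchwork)

p1 <- create_scatterplot(data, "Adelie", "Torgersen")
# Note: in the palmerpenguins data, Chinstrap penguins were recorded only on Dream
# and Gentoo only on Biscoe, so p2 and p3 may be empty if the helper filters on both
p2 <- create_scatterplot(data, "Chinstrap", "Biscoe")
p3 <- create_scatterplot(data, "Gentoo", "Dream")

# p1 and p2 side by side, with p3 underneath
(p1 + p2) / p3
```

1.2.1 Displaying penguins data as a DT table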
1.2.2
```r
library(tidyverse)
library(plotly)
library(hrbrthemes)

# Build the static scatterplot first; it is named penguin_plot so that the
# object is not confused with the penguins dataset
penguin_plot <- palmerpenguins::penguins |>
  filter(!is.na(sex)) |>
  ggplot(
    aes(x = bill_length_mm, y = bill_depth_mm)
  ) +
  geom_point(color = "#69b3a2") +
  labs(
    x = "Bill Length (mm)",
    y = "Bill Depth (mm)",
    title = "Surprising relationship?"
  ) +
  theme_ipsum()

# Convert the ggplot object into an interactive plotly chart
ggplotly(penguin_plot)
```

Using a kable table
| species | average_bill_length |
|---|---|
| Adelie | 38.79139 |
| Chinstrap | 48.83382 |
| Gentoo | 47.50488 |
| species | average_bill_depth |
|---|---|
| Adelie | 18.34636 |
| Chinstrap | 18.42059 |
| Gentoo | 14.98211 |
For instance, the average bill length for the Adelie species is 38.79 mm.
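The code that builds the kable tables and the value quoted in the sentence above is folded in the rendered page. A minimal sketch of both steps follows, assuming a data frame named `avg_length` (a name chosen for this example) that holds the values copied from the first kable table.

```r
library(knitr)

# Values copied from the average bill length table above
avg_length <- data.frame(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  average_bill_length = c(38.79139, 48.83382, 47.50488)
)

# Render the data frame as a Markdown pipe table
kable(avg_length)

# Extract the Adelie average and round it to two decimals for use in a sentence
adelie_avg <- round(
  avg_length$average_bill_length[avg_length$species == "Adelie"],
  digits = 2
)
adelie_avg
```

Computing the quoted number from the table, rather than typing it by hand, keeps the sentence in sync with the data if the dataset changes.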